Mon. Not. R. Astron. Soc. 000, 000-000 (0000) 



Printed 25 May 2012 (MN BT E X style file v2.2) 



Foreground Removal using FASTICA: A Showcase of 
LOFAR-EoR 

Emma Chapman, 1 * Filipe B. Abdalla, 1 Geraint Harker, 2 ' 3 Vibor Jelic, 4 
Panagiotis Labropoulos, 4 ' 5 Saleem Zaroubi, 5 Michiel A. Brentjens, 4 
A.G. de Bruyn, 4 ' 5 L.V.E.Koopmans, 5 

1 Department of Physics & Astronomy, University College London, Gower Street, London, WC1E 6BT 

2 Center for Astrophysics and Space Astronomy, 389 UCB, University of Colorado, Boulder, CO 80309-0389, USA 
3 NASA Lunar Science Institute, NASA Ames Research Center, Moffett Field, CA 94035, USA 

4 ASTRON, PO Box 2, NL-7990AA Dwingeloo, the Netherlands 

5 Kapteyn Astronomical Institute, University of Groningen, PO Box 800, 9700AV Groningen, the Netherlands 



25 May 2012 



ABSTRACT 

We introduce a new implementation of the FASTICA algorithm on simulated LOFAR- 
EoR data with the aim of accurately removing the foregrounds and extracting the 21- 
cm reionization signal. We find that the method successfully removes the foregrounds 
with an average fitting error of 0.5 per cent and that the 2D and 3D power spectra are 
recovered across the frequency range. We find that for scales above several PSF scales 
the 21-cm variance is successfully recovered though there is evidence of noise leakage 
into the reconstructed foreground components. We find that this blind independent 
component analysis technique provides encouraging results without the danger of prior 
foreground assumptions. 

Key words: cosmology: theory - dark ages, reionization, first stars - diffuse radiation 
- methods: statistical. 



1 INTRODUCTION 

Four hundred thousand years after the Big Bang, the re- 
combination of electrons and protons resulted in a neutral 
Universe, steadily cooling with the Hubble expansion. The 
'Dark Ages' followed recombination until, 400 million years 
after the Big Bang, the first ionizing sources came into ex- 
istence. This Epoch of Reionization (EoR) is one of the last 
unobserved eras of our Universe, but with a new generation 
of radio telescopes coming online (e.g. Low Frequency Array 
(LOFARQ Giant Metrewave Radio Telescope (GMRT^ 
Murchison Widefield Array (MWA^Precision Array to 
Probe the Epoch of Reionization (PAPERQ 21 Centime- 
ter Array (21CMAQ Square Kilometre Array (SKA0 this 
is soon about to change. 

It is generally accepted that the most rewarding way 
to probe reionization is through the 21-cm spectral line - 



produced by a spin flip in neutral hydrogen (van de Hulst 
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l945}|Ewen fc Purcell|poT| |Muller fc Oort||1951| ). This 21 



cm radiation can be observed interferometrically at radio 
wavelengths as a deviation from the brightness temperature 



oftheCMB (Field|1958 


Field 1959 Madau, Meiksin & Rees 


1997 Shaver et al.|1999 


)• 



Observationally, the 21-cm signal will be accompanied 
by system noise and Galactic and extragalactic foregrounds 
(e.g. |Jelic et al.||2008| |2010[ ) which are orders of magnitude 
larger than the 21-cm signal we wish to detect. On top of 
this there are systematic effects due to the ionosphere and 
instrument response. The foregrounds must be carefully re- 
moved using a cleaning process of high accuracy and pre- 
cision as any error at this stage has the ability to destroy 
the underlying 21-cm signal. Foreground removal and the 
implications for 21-cm cosmology has been extensively re- 
searched over the past decade (e.g 



Di Matteo et al. 



2002 



Oh fc Mack||2003l |DTMatteo, Ciardi fc Miniati||2004| |Zal- 



darriaga, Furlanetto fc Hernq uist 2004; Morales fc Hewitt 
2004[ |Santos, Cooray fc Knox||2005| |Wang et al.||2006| |M~ 
Quinri et al.|2006[| Jelic et al.|2008||Gleser, Nusser fc Benson 
2008| |Bowman, Morales fc Hewitt |2006| |Harker et al.|200'9T 



Liu, 



cowman, iviorajes^^^iewrcrizuuojjriarKe^^i^iijzuuy 
'egmark fc Zaldarriaga||2009] |Liu et al.||2009| |Harker 



et al.|2010||Liu fc Tegmark|2011||Petrovic fc 6h|2011||Mao 
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2012} |Liu fc Tegmark|2012| ). This paper constitutes only one 
step in the foreground removal process and assumes that 
bright sources have been removed, for example via a flux 
cut ( |Di Matteo et al.|2004 l. 

There is currently no consensus on the most effective 
foreground removal method, though a recent method imple- 
mented by |Harker et al. |(2009 2010J has shown promising 
results while making only minimal assumptions. The same 
method highlighted that foreground removal techniques can 
be carried out both in image and visibility space - with 
the quality of results sometimes dependent on the choice 
of space. It is possible that different methods will be best 
suited for the extraction of different information from the 
data and improvements in the recovery of any one statis- 
tic, even at the expense of another, will still be useful. The 
statistical detection of the EoR signal is fraught with un- 
certainty and applying several foreground cleaning methods 
to the data independently will be invaluable in confirming a 
statistical detection. 

The method presented here is based on the independent 



component analysis (ICA) algorithm, FASTICA (Hyvarinen, 



Karhunen & Oja 20011. ICA is a method originally de- 



signed to separate independent signals with minimal prior 
knowledge of the form of the signals. Thus ICA provides us 
with a foreground removal method which compensates for 
the fact that we do not know the form of the foregrounds 
at the exact resolution and frequency range of LOFAR. A 
non-parametric method, FASTICA allows the foregrounds to 
choose their own shape instead of assuming a specific form, 
such as a polynomial. FASTICA is a versatile tool and has 



been applied recently in the field of exoplanets ( Waldmann 



2012 1 and CMB foreground removal with great success (e.g. 



Maino et al.|[2002| |Maino et al.|[2003| |Maino et al.[ 2007; 



Bottino, Banday fe Maino 2008; Bottino, Banday fe Maino| 



2010) motivating its implementation on other cosmological 



data. The results presented focus on the two main statisti- 
cal aims of current EoR experiments, namely the recovery of 
the power spectrum and variance of the cosmological signal. 

Section [2] briefly describes the FASTICA methodology 
and algorithm used to identify independent components 
(ICs). The various methods used to simulate the 21-cm sig- 
nal, foregrounds and noise are set out in Section [3] The 
results and sensitivity of the FASTICA method are presented 
in Sections [4] and [5] before a final summary and discussion 
in Section |6] 



2 FOREGROUND REMOVAL TECHNIQUES 

The statistical detection of the 21-cm reionization signal de- 
pends on an accurate and robust method for removing the 
foregrounds from the total signal. Since it is impossible to 
observe the foregrounds alone this is not a simple task. 

The first attempts focused on exploiting the angular 
fluctuations of th e 21-cm signal (e.g. [TJi Matteo et al.|2002| 



Oh fe Mack|2003||Di Matteo et al.|2004[ ), but the 21-cm sig- 
nal was found to be swamped by various foregrounds. The 
focus then moved on to the frequency correlation of the fore- 
grounds, with the cross-correlation of pairs of maps used as 
a cleaning step (Zaldarri aga et aT] 2004 San tos et al.|2005 |. 
While the foregrounds are expected to be highly correlated 



teo et al. 20021, the cosmological signal is expected to be 
highly uncorrelated ( Ali, Bharadwaj & Chengalur 2008 1 on 
the same frequency scales, allowing frequency correlation to 
differentiate the signals. 

As more methods have emerged, it has become clear 
that different methods have different advantages and fore- 
ground subtraction has become accepted as a three stage 
process. The first and last stages are bright source removal 



( Cooray fc Furl anctto 2 004] |Di Matteo et al.||2004| |Zaldar 
|riaga et al.||2004| |Morales, Bowman fc Hewitt||2006[ ) and 
residual error subtraction ( Morales fc Hewitt|2004 l and are 
not dealt with in this paper. For our data we assume that 
the first stage has been carried out and all bright sources 
have been removed, which will still leave foregrounds strong 
enough to swamp the 21-cm signal ( Di Matteo et alT]|2002 



Oh & Mack 2003]) . The second stage is known as spectral fit- 
ting, or line of sight fitting, and has become by far the most 
popular in literature. Line of sight methods can be divided 
into subcategories of parametric and non-parametric meth- 
ods. The majority of literature involves parametric methods, 
whereby at some point a certain form for the foregrounds is 
assumed, for example polynomials (e.g. |Santos et al 



Wang et all2006 McQuinn et al.|2006| [Bowman et al 



2005 



2006 



Jelic et al.|2008 ||Gleser et al.|2008||Liu, Tegmark fc Zaldar 



|riaga|2009||Liu et al.|2009||Petrovic fc Oh|2011| ). In contrast^ 
non-parametric methods allow the data to determine the 
form of the foregrounds with many more free parameters, 
allowing much more freedom and not assuming a specific 
form. This has obvious advantages for a cosmological era so 
far not directly observed but results are often not as promis- 
ing as parametric results. A possible exception is the recent 



method presented by |Harker et al. ( |2009||2010[ ) which prefer- 
entially considered foreground models with as few inflection 
points as possible. When applied to LOFAR-EoR data, this 
method compared very favourably with parametric meth- 
ods. FASTICA is another non-parametric method which we 
will show produces similarly promising results. 



2.1 The FASTICA method 
2.1.1 Background 

Introduced in the early 1980s, ICA has established itself as a 
successful component separation technique with widespread 
applications. The method relies on the assumption that the 
multiple elements making up a mixed signal are statistically 
independent. 

ICA methods often formulate the data model as: 



x = As 



(1) 



on scales of 1 MHz (e.g. Gnedin fc Shaver 2004 Di Mat- 



where a; is a vector representing the observed signal, s a 
vector of which the components are assumed mutually in- 
dependent and A a mixing matrix to be calculated. For our 
data we have signal maps of 512 x 512 pixels at 170 dif- 
ferent frequencies. Equation [l] represents one line of sight 
where, if m ICs are assumed, the sizes of x, A and s are 
[170,1], [170, m] and [m,l] respectively. Actually, FASTICA si- 
multaneously considers all lines of sight, so x and s are in 
effect matrices of size [170,512 x 512] and [m,512 x 512] 
respectively. For clarity, we will set out the description as if 
only one line of sight was being considered but the reader 
should bear in mind that all lines of sight are simultaneously 
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and independently treated by the algorithm, with A being 
independent of the line of sight. 

It might immediately strike the reader that the model 
specified here is the noise free ICA model. This is because 
this implementation makes no effort to model the noise com- 
ponent through the x — As + n formulation. Instead one 
must appreciate that it is the way in which FASTICA is not 
robust to noisy components that we take advantage of here. 
Whereas x will represent the observed signal of foreground, 
noise and 21-cm signal, s is considered to be the foregrounds 
only. FASTICA ignores the Gaussian or non-smooth spectral 
components in the observed signal. When we specify m ICs, 
FASTICA reconstructs m ICs relating to the foregrounds only. 

To solve Equation [I] for s, we seek a linear transform: 

s = \Nx (2) 

where W is a constant weight matrix that the ICA method 
aims to determine assuming the elements of s are as statis- 
tically independent as possible. 

FASTICA seeks to estimate W using the concept of mu- 
tual information. An outline of the general philosophy be- 
hind FASTICA is outlined below, but for a full treatment refer 
to 



Hyvarinen (19991 and Hyvarinen et al. (20011 



Let us consider a single component of the signal s: 

WiXi (3) 



y = w x = 



where if w is one of the rows of the inverse of A, y is actu- 
ally one of the ICs, Si. ICA then attempts to minimise the 
Gaussianity of w T x. To understand why, we define a vector 
z : 

z = A T w (4) 

so that we have a weighted sum of the independent signal 
components: 

T 

y = z s 



(5) 



The central limit theorem states that the greater the number 
of independent variables in a distribution, the more Gaus- 
sian that distribution will be. z T s is therefore always more 
Gaussian than any individual Sj. y will be least Gaussian 
when one, and only one, Zi is non-zero, and in such a case 
y is then one of the ICs. Thus by maximising the non- 
Gaussianity of w T x we find one of the ICs. 

In order to estimate and adjust w T x in such a way 
that its Gaussianity converges to a minimum, the methods 
needs a robust measure of non-Gaussianity. FASTICA favours 
negentropy as a measure of non-Gaussianity, which is based 
on the idea of the entropy of a variable, H(y): 



H(y) 



P(y = di)logP(y 



where a; are the possible values of y. 
Negentropy is then defined as: 



J(y) = H(y s 



H(y) 



(6) 



(7) 



where t/ ga uss is a random Gaussian variable with the same 
covariance matrix as y. Using the maximum-entropy princi- 
ple, one can define: 



J(y) » ^ ki[E{Gi{v)} - E{Gi(v)}f 



(8) 



where hi are positive constants, v is a Gaussian variable 
with zero mean and unit variance and G are non-quadratic 
functions. Though almost any non-quadratic function can 
be used, the robustness and speed of the fastica method 
depends on choosing these contrast functions well, with dif- 
ferent contrast functions more suited to different scenarios. 
For this implementation we choose a non linearity, g(u) of: 



g(u) = u x exp 



(9) 



where g(u) — G'(u) = dG j^> . This choice is particularly 
suited when robustness is important or when the compo- 
nents have high kurtosis. 

Since s and A are both unknown, FASTICA cannot deter- 
mine the ICs' magnitudes or order, as we can freely change 
the order of the components in the mixing model or multi- 
ply any of them by a scalar factor which can be balanced 
out by dividing out elsewhere. As such FASTICA fixes the 
magnitudes of the ICs by assuming they have unit variance. 



2.1.2 Algorithm 

Here we summarise the fixed-point FASTICA algorithm for 
finding one IC. 

The mixed signal is input along with a parameter repre- 
senting the number of ICs we assume there to be. A typical 
choice in this implementation is four ICs. 

This data undergoes several preprocessing steps within 
the FASTICA program. First the data are adjusted to be of 
zero mean to simplify the algorithm. Then, using a principal 
component analysis to estimate the eigenvalues and eigen- 
vectors for the data, the data are whitened. This results in 
the vector x where the components are uncorrelated, with 
unit variances. 

We wish to choose a unit vector w such that the non- 
Gaussianity of w T x is maximized. Under the assumption 
that the components have unit variance (which for whitened 
data is equivalent to assuming | j w j | 2 = 1 ) these maxima occur 
where: 



E{xg(w T x)} - pw = 



(10) 



is satisfied. To find the roots of this equation using Newton's 
method we arrive at the approximate Newton iteration: 



w = w 



[E{xg(w T x)} - 0w] 
E{g'( W Tx)} - p 



(11) 



This iteration is carried out using the algorithm sum- 
marised in the following steps (Hyvarinen & Oja 2000): 



(i) Choose an initial random weight vector w 

(ii) Let w + = E{xg(w T x)} - E{g'(w T x)}w 

(iii) Let w = 

(iv) If the old and new values for w are not converged 
repeat the process 

where g is the derivative and g' is the second derivative 
of the chosen contrast function G. The use of the contrast 
function derivatives comes from consideration of where the 
maxima of the negentropy approximation are obtained. The 
non-Gaussianity is maximised along the line of sight and 
across the map simultaneously meaning that the method's 
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constraining power benefits from having more pixels and 
more frequency maps. 

To extend the algorithm to n components requires FAS- 
TICA to run simultaneously for n different weight vectors, 
wi, ...w n where one Wj corresponds to w in the above algo- 
rithm. To ensure that the different WjX converge to different 
maxima (i.e. the same IC is not found twice) all the outputs 
WjX must be decorrelated after every iteration. For a more 



detailed treatment refer to Hyvarinen ( 1999 1 and Hyvarinen 
etHlpOOll. 



2.1.3 Our Implementation 

We make use of the C++ implementation of FASTICA pro- 
vided by the IT++ librarjQ Our foreground subtraction 
proceeds in the following steps: 

(i) Read in the simulation data cube and specify the num- 
ber of foreground ICs for FASTICA to model. 

(ii) Call FASTICA to calculate the mixing matrix and ICs 
of the foregrounds. 

(iii) Reconstruct the foregrounds by performing a multi- 
plication of the mixing matrix, which is common to all lines 
of sight, with the vector of ICs for each line of sight. 

(iv) Find the difference between the reconstructed fore- 
ground cube and the input cube. This residual cube should 
equal the 21-cm signal, noise and and any foreground fitting 
errors. 

Statistical tests can then be carried out on the residuals 
cube to determine if the 21-cm signal is recoverable after the 
foreground removal process. 

It is worth reiterating once more as it such an impor- 
tant point: the ICs referred to in the ICA methodology as 
applied here are the ICs making up the foregrounds - the 
cosmological signal and noise are at no point modelled or 
even taken into account by this FASTICA implementation. 



et al. 



20111 and initialised at 2=300 on a 1800 3 grid. The 



velocity fields used to perturb the initial conditions as well 
as the resulting 21-cm Tb boxes were formed on a cruder 
grid of 450 3 before being interpolated up to 512 3 . We define 
halos contributing ionizing photons as having a minimum 
virial mass of 1 x 1O 9 M0. 

When a hydrogen atom undergoes a ground state hy- 
perfine transition from the excited triplet state, where the 
spins of the proton and electron are parallel, to the singlet 
state, where the spins are anti-parallel, a photon is emitted 
of wavelength 21-cm or frequency v\q = 1420 MHz. The 21- 
cm spectral line is forbidden - the probability of a 21-cm 
transition is 2.9 x 10 -15 s _1 , equivalent to triplet state life- 



time of 10 7 years ( Wild|l9"52 1. Even so, the vast amounts of 
hydrogen in the Universe lead to 21-cm observations being 
achievable (van de Hulst|1945| ). The intensity of 21 -cm radi- 



ation is determined by the spin temperature, T sp i n , defined 



through (Field] 1958 1 



m 
no 



= 3 exp 



_ T * 



(12) 



The spin temperature is a fundamental measure of the num- 
ber densities of the triplet and singlet states (ni, no) where 
T* = = 0.0681 K. 

Tb is detected differentially as a deviation from Tqmb, 
5T b ( |Field|1958[|T959"l|Ciardi fc Madau|2003"l ): 



ST b = 28 mK x (1 + 8)x m 1 



Tcmi 

T s pi n 



n b h 2 \ 

x 

0.0223 I 



0.24 \ 



(13) 



where S is the mass density contrast, h is the Hubble con- 
stant in units of 100 kms -1 MPc -1 , xm is the fraction of 
neutral hydrogen and __(, and £l m are the baryon and mat- 
ter densities in critical density units. 



3 SIMULATED EOR DATA 

We simulate 170 frequency maps between 115 and 200 MHz 
with spacings of 0.5 MHz. The maps consisted of 512 2 pixels 
representing a 10° x 10° observation window, or a resolution 
of 1.17 arcminutes per pixel. Since an interferometer like 
LOFAR is insensitive to the mean value of the brightness 
temperature, we use mean-subtracted maps. 



3.1 21-cm Cosmological Signal 

Of the existing reionization simulation programs (e.g. 
Santos et al.|20f0 I , we use the semi- numeric modelling tool 
21cmFAST to simulate the observable of the 21-cm radia- 



tion, the brightness temperature Tb ( Mesinger & Furlanetto 



2007| |Mesinger, Furlanetto fc Cen||2011[ ). 21cmFAST treats 
physical processes with approximate methods for small 
realisation generation times and has produced results 
which agree well with the most recent hydrodynamical 
simulations. The code was run using a standard cosmology, 



(Ov, fim, n b , n, as, ft)=(0.72,0. 28,0. 046,0.96,0. 82, 73) ( Komat; ;u 



7 http:/ /itpp. sourceforge.net/devel/fastica_8cpp. html 



3.2 Foregrounds 

Though there have been foreground observations at frequen- 
cies r elevant to LOFAR using WSRT ( |Bernardi et al.|2009| 
| 2010[ ) the foreground contamination at the frequencies and 
resolution of LOFAR remains poorly constrained. As a re- 
sult, foreground models directly relevant for this paper rely 
on using constraints from observations at different frequency 
and resolution ranges. These constraints are used to normal- 
ize the necessary extrapolations made from observations to 
create a model relevant for LOFAR-EoR observations. 

In general, the foreground components are modelled as 
power laws in 3+1 dimensions (i.e. three spatial and fre 
quency) such that T D oc u 13 (e.g. 



2008 Jelic et al. 2008 2010 1 



Shaver et al.|1999 



Ali et al. 



The foreground simulations used in this paper are ob- 
tained using the foreground models described in Jelic et al. 
(|2008 20101. The foreground contributions considered in 



these simulations are: 

(i) Galactic diffuse synchrotron emission (GDSE) origi- 
nating from the interaction of free electrons with the Galac- 
tic magnetic field. Incorporates both the spatial and fre- 
quency variation of ft by simulating in 3 spatial and 1 fre- 
quency dimension before integrating over the z-coordinatc 
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to get a series of frequency maps. Each line of sight has a 
slightly different power law. 

(ii) Galactic localised synchrotron emission originating 
from supernovae remnants (SNRs). Together with the 
GDSE, this emission makes up 70 per cent of the total fore- 
ground contamination. Two SNRs were randomly placed as 
discs per 5° observing window, with properties such as power 
law index chosen randomly from the Green ( 2006 1 catalog 

(iii) Galactic diffuse free-free emission due to 
bremsstrahlung radiation in diffuse ionised Galactic 
gas. This emission contributes only 1 per cent of total 
foreground contamination, however it still dominates the 
21-cm signal. The same method as used for the GDSE is 
used to obtain maps, however the value of /3 is fixed to 
-2.15 across the map. 

(iv) Extragalactic foregrounds consisting of contributions 
from radio galaxies and radio clusters and contributing 27 
per cent of the total foreground contamination. The simu- 
lated radio galaxies assume a power law and are clustered us- 
ing a random walk algorithm. The radio clusters have steep 
power spectra and are based on a cluster catalogue from 
the Virgo consort iurrj^] and observed mass-luminosity and 
X-ray-radio luminosity relations. 



Table 1. LOFAR and simulation parameters 



Unlike Jelic et al. (2008 20101, this paper does not con- 



sider the polarisation of the foregrounds. The foregrounds 
simulated here are up to five orders of magnitude larger than 
the signal we hope to detect. Since interferometers such as 
LOFAR measure only fluctuations, foreground fluctuations 
dominate by 'only' three orders of magnitude. We will in- 
vestigate polarized removal in a further analysis. 



3.3 Noise 

The sensitivity of a receiving system is ultimately deter- 
mined by the system noise (iThompson, Moran & Swenson 



Jr. 2001 1. The system noise consists of contributions from 
both the sky and the receivers themselves. Frequency de- 
pendence is introduced into the noise through the sky noise 
and through the frequency dependence of the effective area 
of the telescope. It is assumed that the noise across an image 
is Gaussian at any one frequency. For a detailed look at the 
expected noise on LOFAR measurements, see |Labropoulos| 
et al.M2009l. 



We have decided to reproduce the method for calculat- 
ing the noise here as the noise can have a significant effect 
on foreground cleaning methods. 

Our parameters for calculating the noise are listed in 
Table [l] In order to create a noise curve, Fig. [l] we use the 
following prescription: 

The system noise temperature consists of sky brightness and 
instrumental components. We calculate this system temper- 
ature using: 



T aya = 140 + 60 



(3OO MHz) 



(14) 



The effective area of the array is determined by multiplying 
the effective area of a single dipole by the number of dipoles 
in the array where the effective area of a dipole is limited 



8 http:/ /www. mrao.cam.ac.uk/surveys/snrs/ 

9 http://www.virgo.dur.ac.uk/ 



Parameter 


Description 


Value 




number of dipoles per tile 


16 


nt 


number of tiles per station 


21 


n s 


number of stations 


48 


Va 


antenna efficiency 


1 




system efficiency 


0.9 


Av 


frequency interval [MHz] 


0.5 


tint 


integration time [h] 


600 




area of synthesized beam [arcmin 2 ] 


4 




area of synthesized beam [sr] 


1.35 x 10~ 6 




120 130 140 150 160 170 180 190 200 
V/MHZ 

Figure 1. The rms of the simulated zero mean noise (blue;solid) 
and 21-cm (red;dash) maps over frequency. 



by the size of the tile. We calculate the effective area of the 
LOFAR array using: 



A B ff = min 



3 ' 



1.5625 ndnt 



(15) 



The System Equivalent Flux Density (SEFD) then depends 
on both of the quantities calculated above: 

Finally we calculate the LOFAR noise sensitivity: 
1 SEFD 



SEED = - LByskb 



(16) 



(17) 



Fig. [T] shows the noise curve calculated from this pre- 
scription compared to the rms of our 21-cm simulation. 

For each frequency a LOFAR measurement set was filled 
with Gaussian noise in the uv plane. This was then imaged 
to create a real-space image, the root mean square of which 
can be normalized to the value as given by the prescription 
described above. For example the noise sensitivity at 150 
MHz for an integration time of 600 h and a frequency spacing 
of 0.5 MHz is 64 mK. The 170 noise maps were uncorrelated 
over frequency - i.e. a different noise realization was used to 
fill the measurement set for each frequency. 
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8 T. / K 
b 

0.02 0.04 




Degrees 

Figure 2. The 21-cm signal at 150 MHz, convolved with the 
PSF. The signal is entirely in emission - this map has been ad- 
justed to have a mean of zero to reflect the observations of an 
interferometer. 



5T. IK 
b 



-10-5 5 10 




Degrees 

Figure 3. The total contribution of the simulated foregrounds at 
150 MHz, convolved with the PSF. 



3.4 Dirty Images 

The success of an interferometer such as LOFAR is highly 
dependent on how uv space is sampled. The particular pat- 
tern of uv sampling forms a beam which affects how the 
components such as the foregrounds are seen by the inter- 
ferometer. Dirty images were simulated by convolving with 
the PSF of the LOFAR set up used to simulate the noise in 
the previous section, Fig. [2] and Fig. [3] 

The PSF used for creating dirty images (and for cre- 
ating the noise as described in the previous section) was 
chosen to be the worst in the observation bandwidth - i.e. 
the PSF at 115 MHz. In observations the synthesized beam 
decreases in size with increasing frequency, causing point 
source signals to oscillate with the beam, producing a fore- 
ground signal with an oscillatory signal very much like that 
of the 21-cm signal. However, this mode-mixing contribu- 
tion has been found not to threaten the 21-cm recovery and 
have a power well below the 21-cm level (Bowman et al 



2006 Liu, Tegmark & Zaldarriaga 2009|. As such we leave 



the consideration of a frequency dependent PSF to a future 
paper. 

Once the foregrounds and 21-cm signal have been ad- 
justed for uv sampling, the three component cubes are added 
together. The components of the total STh along a random 
line of sight are shown in Fig. [4] 



3.5 Fourier Transformed Data 

The FASTICA method was implemented separately with data 
both in real and Fourier space. For the latter method, the 
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Figure 4. The redshift evolution of the simulated cosmological 
signal (red; dash dot), foregrounds (pinkjsolid), noise (blue; dash) 
and total combined signal (black; dot). All components have un- 
dergone the PSF convolution. Note the 21-cm signal has been 
amplified by 10 and displaced by -IK for clarity. 



fiducial image cube was 2D Fourier transformed at each fre- 
quency to create a Fourier data cube. The complete cube was 
then processed with FASTICA and the output reverse Fourier 
transformed to obtain the ICs in real space. Unless otherwise 
stated all results refer to real space implementation. 
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Figure 5. In the top panel we show the four columns of the mix- 
ing matrix representing the four ICs. The brightness temperatures 
of the foreground contributions along a random line of sight are 
shown in the bottom panel. We see that the ICs are each a scaled 
mixture of the foreground contributions. 



4 RESULTS 

4.1 The Independent Components 

The top panel of Fig.[5]shows the four ICs found by fastica 
for a clean data cube. These ICs are the columns of the 
mixing matrix, A. For comparison we show the line of sight 
5Tb of the simulated foreground contributions in the bottom 
panel of Fig. [5] 

We can see that no single component corresponds to 
any one single foreground contribution, even when process- 
ing a clean data cube. Instead, the components are all a 
mixture. While in ICs 2 and 4 the presence of Galactic syn- 
chrotron is obvious, in the other components the combina- 
tion of the contributions is not so clear. It is also worth 
noting that while IC4 shows a significant contribution from 
Galactic synchrotron, it is inverted. FASTICA can only deter- 
mine the ICs up to a multiplicative constant and so the sign 
and magnitude of the components are irrelevant. 

The coefficients of the ICs are stored in the matrix s 
and are presented in Fig. [6] Fig. [7] Fig. [8] and Fig. [9] We 
can compare these coefficients to the maps of the foreground 
contributions, Fig. [To] Fig. [TT] and Fig. [12] We see that all 
four coefficients are a mixture of the contributions as ex- 
pected. 



4.2 Fitting Errors and Variance 

We will first discuss the FASTICA results on the simulation 
where the data cube has been convolved with the PSF, the 
data processing is carried out in real space and four ICs 
are assumed. The word 'simulated' is used to refer to the 
input maps and 'reconstructed' is used for the estimates 
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Figure 6. The first coefficient map of the ICs when FASTICA 
processes the clean data cube. 
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Figure 7. The second coefficient map of the ICs when FASTICA 
processes the clean data cube. 



50 
100 
150 
200 h 
250 
300 
350 
400 
450 
500 



100 



200 



300 



400 



500 



Figure 8. The third coefficient map of the ICs when FASTICA 
processes the clean data cube. 
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Figure 9. The fourth coefficient map of the ICs when FASTICA 
processes the clean data cube. 
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Figure 10. The simulated cxtragalactic foregrounds at 150 MHz. 



resulting from FASTICA. The total input signal is separated 
into reconstructed foregrounds and residuals. The residuals 
are the difference between the original mixed signal and the 
reconstructed foregrounds. 

To evaluate the accuracy of the foreground fitting by 
FASTICA, we calculated the foreground fitting error, Equa- 
tion [18] for each pixel. 

r>. , ■ fgrcconstructcd fgsimulatcd r.r. r. \ 

fitting error = — - x 100.0 (18) 

tgsimulated 

In Fig. [13] we plot the Pearson correlation coefficient 
between the foreground fitting errors and foregrounds (top) 
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Figure 11. The simulated Galactic synchrotron foregrounds at 
150 MHz. 
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Figure 12. The simulated Galactic free-free foregrounds at 150 
MHz. 
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Figure 13. a) The Pearson correlation coefficient between the 
foreground maps and foreground fitting errors, b) The Pearson 
correlation coefficient between the noise maps and foreground fit- 
ting errors. 
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Figure 14. The variance across the combined simulated cosmo- 
logical signal and noise (red, dash), noise alone (black, dot) and 
residuals (blue, solid). 



and between the foreground fitting errors and the noise (bot- 
tom). The Pearson correlation coefficient between two data 
sets a and b is defined as: 



T,j( a i - a )( b » -b) 

[£i(a<-a) 3 Ei(k-5) a ]' 



(19) 



where a is the mean of the data set dj, b is the mean of the 
data set bi and the measure is normalized such that r = ±1 
for correlation/anti-correlation. 

We see that there is very little correlation between the 
foreground maps and the foreground fitting errors, with 
around six magnitudes more correlation between the noise 
maps and the foreground fitting errors. 

To get a representation of the foreground fitting error 
over an entire map, the rms error of the fitted foregrounds 
was calculated, Fig. |f 7| ft should be noted that this error 
takes into account all scales - including those with a dispro- 
portionate error as will be seen in the power spectra. The 
rms difference between the simulated and reconstructed fore- 
grounds was calculated over all 5f 2 2 lines of sight for each 
frequency. Also, an rms error for each map was calculated 
using only 68 % of the pixels - with the pixels of lowest error 
selected first. When the outlier pixels are discounted we find 
that the rms error is below 10 mK for the majority of the 
frequency range. This is still high enough to be of concern 
as the 21-cm signal is itself of order tens of mK, however the 
inclusion of all scales means this is a worst case scenario. 

For a statistical detection of the Epoch of Reionization, 
LOFAR aims to detect a non-zero variance after the noise 
and foregrounds have been accounted for. We begin by com- 
bining the simulated noise and simulated 21-cm signal and 
taking the variance of this signal. This can then be com- 
pared to the variance of the FASTICA residuals - Fig. |14| 
The residual variance is recovered at all but the smallest 
frequencies. At frequencies below 120 MHz (or z > 10.8) 
the variance is significantly underestimated, probably as a 
result of foreground overfitting - the leakage of noise power 
into the estimated foreground power. This failure at very 



low frequencies is hardly surprising considering that this is 
where the noise and foregrounds are at their strongest. 

We subtract the variance of the simulated noise directly 
from the variance of these residuals: var (reconstructed 21- 
cm) = var(residuals) - var(noise). This is a fair assumption 
as we should be able to look at the data in narrow frequency 
bins and estimate the statistics (e.g. variance and power 
spectrum) of the noise to a very high accuracy. 

We find that the recovered 21-cm variance, Fig. |15| top- 
left, is not robust to small scale power in the original signal. 
By removing the noise simulation maps manually from the 
residual maps in order to get crude maps of the recovered 
21-cm signal, excess small scale power is evident, Fig. |16| 
We note that we do this direct noise subtraction for a crude 
visual inspection only and not for any of the analytical re- 
sults. The excess power is most likely due to FASTICA not 
being robust to the small scale power (noise) in our data, 
allowing it to leak into the reconstructed foregrounds, ft was 
found that by Fourier filtering the data to entirely remove 
k modes below a threshold corresponding to a multiple of 
the PSF scale, the variance recovery was significantly im- 
proved. A very good recovery occurs with filtering below 5 



PSF scales (Fig. 15 bottom-right). 

At the extremes of the frequency range the recon- 
structed variance increasingly diverges from that of the sim- 
ulated 21-cm. Both the noise and the foregrounds are at 
their largest at lower frequencies meaning that both fitting 
errors and noise leakage is likely to be largest here, leading 
to less accurate 21-cm reconstruction. Equally, at the larger 
frequencies, the 21-cm signal is almost non-existent mak- 
ing an accurate reconstruction difficult when swamped with 
fitting errors and noise. 

This variance calculation was also carried out on a data 
cube where the residual and noise maps were smoothed from 
a 512 2 grid to a 256 2 grid before the same variance calcu- 
lation was carried out above and compared to the variance 
of a smoothed simulated 21-cm map. The curves are, as ex- 
pected, slightly smoother however the trend and conclusions 
remain the same. 
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Figure 15. The variance across the simulated (red; dash) and 
reconstructed 21-cm maps (blue; solid) for the fiducial data and 
data which has had Fourier filtering of modes below 2,3 and 5 
PSF scales (in reading order). 
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Figure 17. The rms error of the 4 IC reconstructed foregrounds 
for when all pixels are considered (blue;dash) and when only 
the middle 68 per cent of the error distribution is included 
(blue;solid). Also, the rms errors of the reconstructed foregrounds 
for FASTICA applied according to models with 2 (red; dot) and 6 
(black; dashdot) ICs, with only the middle 68 per cent of the error 
distribution included. 
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Figure 16. The reconstructed 21-cm signal at 150 MHz for dirty 
data. We see that while there is a strong correlation between the 
large scale structure in this image and the original signal, Fig. 
[2] there is also a large amount of excess small scale structure, 
probably due to noise leakage into the foregrounds. 



Figure 18. The variance across the simulated (red; dash) and 
reconstructed maps at each frequency, for the FASTICA algorithm 
run with the assumption of 2 (black; dot), 4 (blue; solid) and 6 
(pink; dot dash) ICs. The data has been Fourier filtered at the 5 
PSF scale. 



4- 2.1 Varying the Number of ICs 

The FASTICA algorithm requires specification of the number 
of ICs to be used in the reconstructed foreground model. 
Though we have modelled the various foreground contribu- 
tions, it is not a trivial task to determine how these depend 
on each other and to what degree. To test the sensitivity of 
our results to the number of ICs chosen we calculate the rms 



error and variance recovery for IC numbers of 2, 4 and 6 in 
Fig. [17] and Fig. [18] 

We see that small variations in the number of ICs does 
not endanger the statistical recovery of the 21-cm signal. For 
the remainder of this paper, four ICs are assumed. 
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4.3 Power Spectra 

Together with the variance, EoR experiments aim to recover 
the power spectrum of the cosmological signal over a broad 
range of frequencies. 

Different effects are important for modes parallel and 
perpendicular to the line of sight. For example, consider the 
scenario where the foregrounds have been under-fitted by a 
constant over the frequency range. This offset will not be 
evident in the ID power spectrum of the residuals, however 
will be evident in the angular power spectrum if that con- 
stant is dependent on line of sight. Thus it has been argued 



by Harker et al. (20101 that for LOFAR data, the sepa- 



rate calculation of ID and 2D power spectra has its advan- 
tages. However this does not consider modes neither parallel 
or perpendicular to the line of sight and as such we calcu- 
late 2D and 3D power spectra. We note here that we only 
performed one simulation of the cosmological signal so the 
power spectrum error bars relate to this specific realisation 
of the density field. 



4-3.1 Angular Power Spectra 

The angular power spectrum of a map at a single frequency 
is calculated by 2D Fourier transforming that map and bin- 
ning the pixels according to Fourier scale, k. The power at 
any particular k, {5(k)S* (k)) is the average power of all the 
uv cells in the bin centering on k. The error on the point for 
a particular bin, ki are calculated as cti — ili!h)JLJJh)l w here 



rifc i is the number of uv cells that reside in that k bin. The 
power spectra were averaged over frequency bandwidths of 
2.5 MHz and all frequencies quoted are the middle frequency 
of the bandwidth. The power spectrum of the reconstructed 
cosmological signal is calculated by subtraction of the noise 
power spectrum from the FASTICA residuals power spectrum. 
The error on the simulated 21-cm power spectrum is added 
in quadrature with the error on the noise to reflect the error 
on the reconstructed 21-cm power spectrum. Note that we 
assume Gaussianity whereas the 21cm signal is not Gaus- 
sian and also we calculate the error bars from the power of 
a single realization rather than over an ensemble of simula- 
tions. We ask the reader to bear in mind that these error 
bars might be considered incomplete because of this. 

The quantity actually plotted is A2 D (k) — 



where A is the area of the simulation 



Ak- ! {S(k)S"(k)) 

map. 

Fig. [19] shows the extent to which the FASTICA method 
can recover the 21-cm angular power spectrum. Overall, the 
21-cm power spectrum is convincingly recovered across the 
redshift range. Any points where the power of the residuals 
are below the power of the noise are omitted, as this leads to 
an unrealistic negative reconstructed 21-cm power. As such, 
there is a lack of data at small scales indicative of noise 
leakage into the foregrounds. 

This noise leakage could be a resolution effect (i.e. arte- 
facts originating from correlated noise) or simply a result of 
FASTICA not being robust to noise, most likely the latter. 
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Figure 19. 2D power spectrum of the simulated 21-cm signal 
(black;solid), reconstructed 21-cm signal (red;points), residuals 
(blue, dash) and noise (pink, dotted) at 131 MHz, or 2=9.84, 151 
MHz, or 2=8.40 and 171 MHz, or 2=7.30 from top to bottom. 
Any error bars extending to below the x axis in linear space are 
shown extending to negative infinity in log space. 



4.3.2 3D Power Spectra 

To calculate the 3D power spectra we divide the cube into 
sub-bands of 8 MHz to avoid signal evolution effects. For 
each sub band we then carry out a 3D Fourier transform 
and calculate the 3D power spectrum in spherical annuli 
in Fourier space. The frequencies attached to the plots cor- 
respond to the centre of the sub band plotted. What we 

where 



Vk 3 (S(k)S*(k)) 
2^ 



actually plot is the quantity A§ D (fc) 
V is the volume. 

We find the same accurate recovery on scales above a 
few multiples of the PSF but with smaller errors due to the 
larger amount of data evaluated, Fig. |20| 



4-3.3 Cross Correlation Power Spectra 

To try and retrieve a more robust reconstructed 21-cm power 
spectrum, the cross correlation of two data cube realisa- 
tions was carried out. Two independent noise realisations 
were created and combined with identical foregrounds and 
21-cm signals to create two data cubes with the only differ- 
ence being the noise realisation. FASTICA was performed on 
both of these cubes separately, resulting in two residual files. 
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Figure 20. 3D power spectrum of the simulated 21-cm signal, 
reconstructed 21-cm signal, residuals and noise at 135 MHz, or 
z=9.51, 151 MHz, or z=8.40 and 175 MHz, or z=7.11 over an 
8 MHz sub band. Any points where the power of the residuals 
are below the power of the noise are omitted, as this leads to an 
unrealistic negative reconstructed 21-cm power. The error bars 
and linestyles are as described in Fig. |19| 
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Figure 21. Cross correlations of the two residuals (blue, cross), 
two reconstructed 21-cm signals (red, square), two fitting error 
estimates (pink, circle) and the auto-correlation of the simulated 
(black.solid) at 131 MHz, 151MHz and 171MHz. Only one set of 
error bars is shown for clarity. 



We carried out cross correlations on the two reconstructed 
cosmological signals, the two residual files and the two 21- 
cm fitting error estimates (i.e. reconstructed 21-cm minus 
the simulated 21-cm). By cross correlating the two residual 
signals consisting of two different noise realizations, we in- 
crease the amount of noise that will drop out in the noise 
cross terms, hopefully resulting in a more accurate power 
spectrum recovery when applied to real data. However we 
do not expect a significant improvement in comparison to 
our 2D autospectra here as we have already assumed a per- 
fect knowledge of the noise spectrum. Instead we do this 
as an example of a more robust method of power spectrum 
recovery for real data. Note that since correlations can be 
negative, it is the absolute value that is plotted. The errors 
bars on the cross spectra are calculated in the same way as 
for the auto spectra, namely: at = ilihi^AhjJl w here n^. is 

the number of pixels that resided in that k bin. The power 
spectra recovered as a result of this process are shown in 
Fig. [21] 

The cross correlations were also carried out on two noise 
realizations which were adjusted to have roughly 10 times 
the signal to noise ratio of the LOFAR realizations (similar 



to what is hoped for SKA), Fig. 22 We see that with a 



higher signal to noise ratio the auto and cross correlation 
estimates are significantly improved. 



4.4 Kurtosis and Skewness 

Skewness and, to a lesser extent, kurtosis have both been 
suggested as alternative statistics for the 21-cm detection 
due to their increased robustness to fitting errors compared 



to the variance ( |Harker et aL| 2009 ; Wyi the fc Morales|2007 1 



We define skewness (71) and kurtosis (72) in Equations |20| 



and [211 respectively. 



7i = 



72 



(20) 



(21) 



Kurtosis is defined in such a way that a Gaussian distribu- 
tion would have a kurtosis of zero. 

The structure of the 21-cm skewness and kurtosis for 
different source models has been discussed by Harker et al.| 



(20091; Wyithe & Morales (20071; Iliev et al. (20111. There 



is expected to be a skewness of the 21-cm signal as the signal 
becomes increasingly non-Gaussian as the regions of ionised 
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Figure 22. Cross correlations of the two residuals (blue, cross), 
two reconstructed 21-cm signals (red, square), two fitting error 
estimates (pink, circle) and the auto-correlation of the simulated 
(black, solid) and reconstructed cosmological signal (for one real- 
ization) (red, circles) at 150 MHz. The noise realizations involved 
have been adjusted to be 10 times smaller than the LOFAR real- 
izations. 
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Figure 24. The kurtosis of the simulated 21-cm signal plus noise 
(red), noise alone (black;dot) and residual maps (blue; dash). 
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Figure 23. The skewncss of the simulated 21-cm signal plus noise 
(red), noise alone (black;dot) and residual maps (blue; dash). 



Figure 25. The skewness of the simulated 21-cm signal (red; 
solid) and reconstructed 21-cm maps (blue; dash) for the fiducial 
signal and for different levels of Fourier filtering: 2,3 and 5 PSF 
scales (in reading order). 



hydrogen become more numerous. Simulations also show an 
increase in skewness at very low redshift due to a high bright- 
ness temperature tail related to regions with some remaining 
neutral hydrogen. 



Harker et al. ( 2009 1 employed a Wiener filter on the 



dirty residual data to denoise the images, recovering the 
general trends of the 21-cm skew. Kurtosis recovery proved 
more elusive. We present the skewness and kurtosis of the 
residuals cubes, Fig. |23| and Fig. [24] 

The skewness and kurtosis in the residual images is re- 
covered very well, accurately matching the simulated noise 
plus 21-cm signal skewness and kurtosis across the frequency 
range. 

We now manually subtract the noise cube from the 
residuals cube and plot the kurtosis/skewness of this recon- 
structed 21-cm signal, Fig. [25] and Fig. |26[ This amounts 



to assuming that we know the noise distribution perfectly 
which, though not viable for real data, allows us an insight 
into the recovered signal. 

We see that the skewness dip at low frequencies is only 
convincingly recovered with a high level of Fourier filtering. 
At the high frequencies, where the cosmological signal is very 
small, the skewness is not recovered. The dip in kurtosis 
at frequency 165 MHz is somewhat recovered for Fourier 
filtering below 2 PSF scales while it takes up to 5 PSF scales 
of k space filtering before the peak centred around 140MHz 
is recovered. For both statistics there is a divergence above 
frequencies of 180 MHz, where the cosmological signal is 
very small. 

All of the results presented in this section are for FAS- 
TICA being implemented in real space. While an implemen- 
tation was carried out in Fourier space, the general conclu- 
sions for all results remained the same. Though there were 
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Figure 26. The kurtosis of the simulated 21-cm signal (red; solid) 
and reconstructed 21-cm maps (blue; dash) for the fiducial signal 
and for different levels of Fourier filtering: 2,3 and 5 PSF scales 
(in reading order). 



small local variations in, for example, the recovered power 
spectrum points or kurtosis values, the graphs were for all 
intents and purposes duplications of the real space versions 
and are therefore not reproduced here. 



5 SENSITIVITY OF FASTICA 

So far in this paper we have assumed that the full field of 
view and frequency bandwidth of the simulation is input to 
FASTICA but we must also consider whether this method will 
be as successful under more constrained observations. 



5.1 Bandwidth of Observation 

Firstly, we assess the sensitivity to bandwidth and split the 
dirty data cube into two smaller cubes of bandwidth 42.5 
MHz, one from 115MHz - 157MHz and one from 157.5MHz 
- 199.5MHz. We perform FASTICA on each of these separately 
and measure the 2D power spectrum as described previously, 
Fig. [27} 

We can see that even for slices at the end of the cube 



frequency range (i.e. Fig. 27 top shows a slice 6 MHz from 
the end of that cube) the 21-cm reconstruction is successful. 
The general degradation is not unexpected as the more data 
a separation technique has to fix the foregrounds, the better 
the reconstruction will be. We conclude that the method is 
not sensitive to the point of endangering the signal recovery, 
but larger band widths are preferable. 

5.2 Noise 

Despite the encouraging results so far, the evidence of noise 
leakage in the recovered maps (Fig. 



16 \ and the variance 



recovery (Fig. 15 1 motivates us to consider the sensitivity of 



the 21-cm statistical recovery when there is increased noise 
in the observation. 

We have seen that by much reducing the expected LO- 
FAR noise to expected SKA levels, the 21-cm cross corre- 
lations power spectrum recovery is extremely accurate, Fig. 
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Figure 27. 2D power spectrum of the simulated 21-cm signal 
(black;solid), reconstructed 21-cm signal (red, points), residuals 
(blue, dash) and noise (pink, dotted) at 151 MHz (top) and 171 
MHz (bottom). The graphs are for data cubes of bandwidth 42.5 
MHz and ranges 115MHz - 157MHz and 157.5MHz - 199.5MHz 
respectively. Any error bars extending to below the x axis in linear 
space are shown extending to negative infinity in log space. 



|22| For completeness, here we set up some 'worst case' sce- 
narios, whereby we measure the recovered power spectra in 
the presence of two times, three times and five times the 
expected LOFAR noise, Fig. [28] 

As expected, the more noise present, the less accurately 
the 21-cm power spectrum is recovered. For twice the ex- 
pected level of noise we see the larger scales beginning to be 
overestimated, the extent of which worsens for three times 
the expected noise. For five times the expected amount 
of noise the power spectrum is significantly overestimated 
across the scale range. However we must stress that the fact 
that the 21-cm power spectrum is recovered across a wide 
scale range, even in the presence of twice the noise levels 
expected, can only be seen as extremely promising. 



5.3 Field of View 

In this paper we have assumed a 10° x 10° field of view, 
which is at the upper limit of what we can expect for LO- 
FAR observations. To explore the sensitivity of the analy- 
sis to the field of view, we now process a 2.5° x 2.5 ° data 
cube. If we had kept the noise and the resolution the same, 
analysing such a data cube would be plagued with noise 
as we would have reduced the number of pixels that we are 
analysing. Hence, we can choose to analyse a smaller patch in 
the sky with a higher resolution and same noise or decrease 
the noise and have similar resolution in order to have simi- 
lar constraining power as the fiducial analysis and establish 
the effect of the sky area coverage. In actual observations a 
decrease in field of view and an increase in resolution would 
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Figure 28. 2D power spectrum of the simulated 21-cm signal 
(black, solid), reconstructed 21-cm signal (red, points), residuals 
(blue, dash) and noise (pink, dotted) at 151 MHz, or 2=7.30. From 
top to bottom, the noise simulation is set at twice, three times and 
five times the expected LOFAR noise respectively. Any error bars 
extending to below the x axis in linear space are shown extending 
to negative infinity in log space. 

be related to the size of the stations and distribution of the 
stations respectively. If we had changed the resolution we 
would no longer correspond strictly speaking to a LOFAR 
case scenario. We therefore decide to decrease the field of 
view by a factor of 4 and enhance the signal to noise by 
a factor of 16. We see that the residuals are actually lower 
than the original 21-cm signal at the larger scales, Fig. |29| 
however the 21-cm power spectrum is still well recovered at 
the smaller scales. We interpret this as evidence that the 21- 
cm signal has been mixed into the other signals by FASTICA, 
potentially because FASTICA did not have as many lines of 
sight to remove the foregrounds, making the reconstruction 
less accurate though still successful. 



6 CONCLUSIONS 

We have presented a new implementation of a non- 
parametric foreground cleaning method using the FASTICA 
algorithm. FASTICA is an ICA technique which uses negen- 
tropy as a measure of non-Gaussianity. By maximising the 
non-Gaussianity of a signal mixture the ICs of the fore- 
grounds can be separated. FASTICA can then reconstruct the 
foregrounds, with any data not considered to be part of the 
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Figure 29. 2D power spectrum of the simulated 21-cm signal 
(black, solid), reconstructed 21-cm signal (red, points), residuals 
(blue, dash) and noise (pink, dotted) at 151 MHz, or 2=7.30, for a 
2.5° X 2.5° field of view. Any error bars extending to below the 
x axis in linear space are shown extending to negative infinity in 
log space. 



foregrounds forming the residuals. The residuals consist of 
the 21-cm signal, system noise and fitting errors. 

The success of using the FASTICA method to obtain an 
EoR signature was tested by attempting extraction of the 
two main statistical markers of the EoR, the 21-cm power 
spectrum and variance. The rms foreground fitting error is 
bounded below 10 mK across almost all of the frequency 
range when pixels with disproportionate errors due to un- 
usually small foreground values are discarded. 

Once the variance of the noise has been subtracted from 
the variance of the residuals, an excess variance is recovered. 
To accurately recover the 21-cm variance it was necessary to 
Fourier filter the data up to about 5 times the PSF scale. In 
this case the excess variance accurately recovers the order 
and shape of the simulated 21-cm variance across the ma- 
jority of the frequency range, failing only where the signal 
to noise is extremely low. 

The 21-cm angular power spectrum and 3D power spec- 
trum are recovered very well across a wide frequency range. 

Performing the ICA in Fourier space provides no partic- 
ular advantages or disadvantages according to the statistical 
tests carried out in this paper. This is in contrast to other 
methods which have shown preference towards processing in 
Fourier space ( jHarker et al.|2009[ ). 

The FASTICA method has proved not to be robust in the 
presence of large amounts of noise. Though impressive re- 
sults are obtained at large scales even for twice the expected 
levels of noise, levels above this endanger the recovery. 

We have shown that FASTICA can be a competitive fore- 
ground removal technique for EoR data, though for a full 
treatment of the LOFAR-EoR data, the polarisation of the 
simulated data and a more accurate frequency dependent 
PSF model needs to be considered in future work. 
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